Insufficient training samples are a common problem in neural network applications. While data augmentation methods require at least a minimum number of samples, we propose a novel, rendering-based pipeline for synthesizing annotated datasets. Rather than modifying existing samples, our approach synthesizes entirely new samples. The proposed rendering-based pipeline is capable of generating and annotating synthetic and partially real image and video data in a fully automatic procedure. Moreover, the pipeline can aid the acquisition of real data. The proposed pipeline is built on a rendering process that generates the synthetic data, while the partially real data brings the synthetic sequences closer to reality by incorporating real cameras during the acquisition process. Extensive experimental validation in the context of automatic license plate recognition demonstrates the benefits of the proposed data generation pipeline, especially for machine learning scenarios with limited available training data. Compared to an OCR algorithm trained solely on real data, the experiments show that the character error rate and miss rate decrease from 73.74% and 100% to 14.11% and 41.27%, respectively. These improvements are achieved by training the algorithm on synthetic data alone. When real data is additionally incorporated, the error rates can be reduced further: the character error rate and miss rate then drop to 11.90% and 39.88%, respectively. All data used during the experiments, as well as the proposed rendering-based pipeline for automatic data generation, are publicly available (URL will be revealed upon publication).
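The fully automatic generate-and-annotate idea can be sketched as follows. This is a stdlib-only illustration with a made-up plate format and rendering parameters, not the paper's actual pipeline: because each sample is synthesized from scratch, its ground-truth annotation comes for free.

```python
import random
import string

PLATE_LETTERS = string.ascii_uppercase
PLATE_DIGITS = string.digits

def synthesize_sample(rng: random.Random) -> dict:
    """Generate one fully annotated synthetic sample description.

    The plate text is the ground-truth annotation; the rendering
    parameters (pose, blur, compression) would drive an actual
    renderer in a full pipeline.
    """
    text = ("".join(rng.choice(PLATE_LETTERS) for _ in range(2))
            + "-"
            + "".join(rng.choice(PLATE_DIGITS) for _ in range(4)))
    return {
        "annotation": text,                   # ground truth, known by construction
        "yaw_deg": rng.uniform(-30.0, 30.0),  # camera pose
        "blur_sigma": rng.uniform(0.0, 2.0),  # optical degradation
        "jpeg_quality": rng.randint(20, 95),  # compression level
    }

def synthesize_dataset(n: int, seed: int = 0) -> list:
    """Deterministically synthesize n annotated samples."""
    rng = random.Random(seed)
    return [synthesize_sample(rng) for _ in range(n)]

dataset = synthesize_dataset(1000)
print(len(dataset), dataset[0]["annotation"])
```

Seeding makes the generated dataset reproducible, which matters when comparing training runs.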
translated by Google Translate
In this paper, we propose a data augmentation framework for optical character recognition (OCR). The proposed framework is able to synthesize new viewing angles and illumination scenarios, effectively enriching any available OCR dataset. Its modular structure allows modifications to match individual user requirements. The framework makes it possible to comfortably scale the augmentation factor of an available dataset. Furthermore, the proposed method is not limited to single-frame OCR but can also be applied to video OCR. We demonstrate the performance of the framework by augmenting a 15% subset of the common Brno Mobile OCR dataset. Our proposed framework is capable of boosting the performance of OCR applications, especially for small datasets. Applying the proposed method yields improvements of up to 2.79 percentage points in terms of character error rate (CER) and up to 7.88 percentage points in terms of word error rate (WER) on the subset. In particular, the recognition of challenging text lines can be improved: for this category, the CER can be reduced by up to 14.92 percentage points and the WER by up to 18.19 percentage points. Moreover, when training on the 15% subset augmented with the proposed method, we are able to achieve smaller error rates than with the original, non-augmented full dataset.
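A minimal numpy sketch of what synthesizing new illumination scenarios, plus a crude stand-in for viewpoint changes, could look like for one text-line crop. The gamma values, the shift-based "viewpoint" surrogate, and all names are illustrative assumptions, not the framework's actual modules:

```python
import numpy as np

def augment_illumination(img: np.ndarray, gammas) -> list:
    """Return one gamma-corrected copy of `img` per value in `gammas`.

    `img` is a float array in [0, 1]; gamma < 1 brightens, gamma > 1
    darkens, emulating different illumination conditions.
    """
    return [np.clip(img, 0.0, 1.0) ** g for g in gammas]

def augment_viewpoint(img: np.ndarray, shifts) -> list:
    """Emulate small viewpoint changes by horizontal pixel shifts
    (a crude stand-in for full perspective warps)."""
    return [np.roll(img, s, axis=1) for s in shifts]

rng = np.random.default_rng(0)
text_line = rng.random((32, 128))   # stand-in for a text-line crop
augmented = augment_illumination(text_line, [0.5, 1.0, 2.0])
augmented += augment_viewpoint(text_line, [-2, 2])
print(len(augmented))  # augmentation factor of 5 for one input
```

Each augmentation module is a pure function returning a list of variants, so new modules can be chained freely; this mirrors the modular, user-configurable structure described above.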
Forensic license plate recognition (FLPR) remains an open challenge in legal contexts such as criminal investigations, where unreadable license plates (LPs) need to be deciphered from highly compressed and/or low-resolution footage, e.g., from surveillance cameras. In this work, we propose a side-informed Transformer architecture that embeds knowledge of the input compression level to improve recognition under strong compression. We show the effectiveness of Transformers for license plate recognition (LPR) on a low-quality, real-world dataset. We also provide a synthetic dataset containing strongly degraded, illegible LP images and analyze the impact of the embedded knowledge on it. The network outperforms existing FLPR methods and standard state-of-the-art image recognition models while requiring fewer parameters. For the most severely degraded images, we can improve recognition by up to 8.9%.
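A common way to embed side information such as a compression level into a Transformer is to prepend it to the input sequence as an extra token, so self-attention can condition every position on it. The numpy sketch below illustrates that generic pattern with random "learned" embeddings and assumed dimensions, without claiming it is the exact mechanism of the proposed architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
D = 64          # model (embedding) dimension
N_LEVELS = 8    # discretized compression levels

# Learned lookup table of compression-level embeddings (random here).
level_embedding = rng.normal(size=(N_LEVELS, D))

def prepend_side_info(tokens: np.ndarray, level: int) -> np.ndarray:
    """Prepend the compression-level embedding as an extra token so
    that self-attention can condition every position on it."""
    side = level_embedding[level][None, :]     # shape (1, D)
    return np.concatenate([side, tokens], axis=0)

patch_tokens = rng.normal(size=(16, D))        # e.g. image patch features
seq = prepend_side_info(patch_tokens, level=3)
print(seq.shape)  # (17, 64)
```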
Freely available and easy-to-use audio editing tools make it straightforward to perform audio splicing. Convincing forgeries can be created by combining various speech samples from the same person. Detecting such splices is important both in the public sector, when considering misinformation, and in legal contexts, to verify the integrity of evidence. Unfortunately, most existing detection algorithms for audio splicing use handcrafted features and make specific assumptions. However, criminal investigators are often faced with audio samples from unconstrained sources with unknown characteristics, which raises the need for more generally applicable methods. With this work, we aim to take a first step towards unconstrained audio splicing detection to address this need. We simulate various attack scenarios in the form of post-processing operations that may disguise splicing. We propose a Transformer sequence-to-sequence (seq2seq) network for splicing detection and localization. Our extensive evaluation shows that the proposed method outperforms existing dedicated splicing detection approaches [3, 10] as well as the general-purpose networks EfficientNet [28] and RegNet [25].
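To make the detection and localization task concrete, here is a deliberately naive energy-discontinuity baseline in numpy. This is not the proposed seq2seq Transformer, and it is easily defeated by the post-processing attacks discussed above; it only illustrates the task's input (a waveform) and output (an estimated splice point):

```python
import numpy as np

def frame_energies(signal: np.ndarray, frame_len: int = 256) -> np.ndarray:
    """Mean energy per non-overlapping frame."""
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    return (frames ** 2).mean(axis=1)

def locate_splice(signal: np.ndarray, frame_len: int = 256) -> int:
    """Return the sample index of the largest frame-to-frame energy
    jump -- a crude splice-point estimate."""
    e = frame_energies(signal, frame_len)
    jump = np.abs(np.diff(e))
    return int((np.argmax(jump) + 1) * frame_len)

rng = np.random.default_rng(0)
# Two recordings with different loudness, naively spliced at sample 8192.
a = 0.1 * rng.normal(size=8192)
b = 0.8 * rng.normal(size=8192)
spliced = np.concatenate([a, b])
print(locate_splice(spliced))  # close to 8192
```

A simple loudness normalization already hides the splice from this baseline, which is exactly why learned, more general detectors are needed.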
A common task in image forensics is the detection of spliced images, where multiple source images are composed into one output image. Most of the currently best-performing splicing detectors leverage high-frequency artifacts. However, after an image has been strongly compressed, most of these high-frequency artifacts are no longer available. In this work, we explore an alternative approach to splicing detection that is potentially better suited for in-the-wild images subject to strong compression and downsampling. Our proposal is to model the color formation of an image. Color formation largely depends on variations at the scale of scene objects, and is hence much less dependent on high-frequency artifacts. We learn a deep metric space that is, on the one hand, sensitive to the illuminant color and the camera white-point estimation, but, on the other hand, insensitive to variations in object color. Large distances in the embedding space indicate that two image regions stem from different scenes or different cameras. In our evaluation, we show that the proposed embedding space outperforms the state of the art on images subject to strong compression and downsampling. We confirm the dual nature of the metric space in two further experiments, namely that it characterizes both the acquisition camera and the scene illuminant color. As such, this work resides at the intersection of physics-based and statistical forensics, to the benefit of both.
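The decision rule above, where a large embedding distance implies a different scene or camera, can be illustrated with a toy stand-in for the learned network. The linear map, the L2 normalization, and the threshold below are assumptions chosen for illustration only, not the trained model:

```python
import numpy as np

def embed(region: np.ndarray, W: np.ndarray) -> np.ndarray:
    """Toy stand-in for the learned embedding network: a linear map
    followed by L2 normalization onto the unit sphere."""
    v = W @ region.ravel()
    return v / np.linalg.norm(v)

def same_source(r1, r2, W, threshold: float = 0.8) -> bool:
    """Small embedding distance -> same scene/camera; large -> splicing
    is suspected between the two regions."""
    d = np.linalg.norm(embed(r1, W) - embed(r2, W))
    return d < threshold

rng = np.random.default_rng(0)
W = rng.normal(size=(32, 64))       # placeholder "weights"
region = rng.normal(size=(8, 8))    # placeholder image region
# Identical regions embed identically (distance 0) -> same source.
print(same_source(region, region, W))  # True
```

In the actual method, the embedding is trained so that this distance responds to illuminant color and camera white point while ignoring object color.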
Figure 1: FaceForensics++ is a dataset of facial forgeries that enables researchers to train deep-learning-based approaches in a supervised fashion. The dataset contains manipulations created with four state-of-the-art methods, namely, Face2Face, FaceSwap, DeepFakes, and NeuralTextures.
Robotic teleoperation is a key technology for a wide variety of applications. It allows sending robots instead of humans to remote, possibly dangerous locations while still using the human brain with its enormous knowledge and creativity, especially for solving unexpected problems. A main challenge in teleoperation consists of providing enough feedback to the human operator for situation awareness, thus creating full immersion, as well as offering the operator suitable control interfaces to achieve efficient and robust task fulfillment. We present a bimanual telemanipulation system consisting of an anthropomorphic avatar robot and an operator station providing force and haptic feedback to the human operator. The avatar arms are controlled in Cartesian space with a direct mapping of the operator movements. The measured forces and torques on the avatar side are haptically displayed to the operator. We developed a predictive avatar model for limit avoidance which runs on the operator side, ensuring low latency. The system was successfully evaluated during the ANA Avatar XPRIZE competition semifinals. In addition, we performed lab experiments and carried out a small user study with mostly untrained operators.
The purpose of this work was to tackle practical issues which arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot which overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input which resolves the redundancy in the kinematics and a physical interpretation of this redundancy are provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased for increasing proximal section angle for all testing conditions with an average error reduction of 41.48% for retension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.
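The two compensation ideas can be sketched on a toy plant model: hysteresis compensation advances the command to cancel a known time delay, and re-tension compensation removes a DC offset that grows with the proximal section angle. The plant model and all constants below are illustrative assumptions, not the paper's identified system:

```python
import numpy as np

def plant(command: np.ndarray, t: np.ndarray, proximal_deg: float,
          delay: float = 0.1, offset_per_deg: float = 0.05) -> np.ndarray:
    """Toy bending-angle response: the commanded sinusoid arrives with a
    hysteresis-like time delay plus a DC offset set by the proximal angle."""
    delayed = np.interp(t - delay, t, command, left=command[0])
    return delayed + offset_per_deg * proximal_deg

t = np.linspace(0.0, 10.0, 2001)
desired = 30.0 * np.sin(2 * np.pi * 0.5 * t)   # sinusoidal input, degrees
proximal = 40.0                                # static proximal section angle

baseline = plant(desired, t, proximal)

# Hysteresis compensation: advance the command by the known delay.
advanced = np.interp(t + 0.1, t, desired, right=desired[-1])
# Re-tension compensation: cancel the proximal-angle DC offset.
compensated = plant(advanced, t, proximal) - 0.05 * proximal

err_base = np.mean(np.abs(baseline - desired))
err_comp = np.mean(np.abs(compensated - desired))
print(err_comp < err_base)  # True
```

The split matches the error analysis above: the phase advance targets the time-delay term, while the offset subtraction targets the DC term, which was the primary source of increasing error.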
Learning-enabled autonomous systems provide increased capabilities compared to traditional systems. However, the complexity and probabilistic nature of the underlying methods enabling such capabilities present challenges for current systems engineering processes for assurance and for test, evaluation, verification, and validation (TEVV). This paper provides a preliminary attempt to map recently developed technical approaches from the assurance and TEVV literature on learning-enabled autonomous systems (LEAS) to a traditional systems engineering v-model. This mapping categorizes such techniques into three main approaches: development, acquisition, and sustainment. We review the latest techniques for developing safe, reliable, and resilient learning-enabled autonomous systems, without recommending radical and impractical changes to existing systems engineering processes. By performing this mapping, we seek to assist acquisition professionals by (i) informing comprehensive test and evaluation planning, and (ii) objectively communicating risk to leaders.
In inverse reinforcement learning (IRL), a learning agent infers a reward function encoding the underlying task using demonstrations from experts. However, many existing IRL techniques make the often unrealistic assumption that the agent has access to full information about the environment. We remove this assumption by developing an algorithm for IRL in partially observable Markov decision processes (POMDPs). We address two limitations of existing IRL techniques. First, they require an excessive amount of data due to the information asymmetry between the expert and the learner. Second, most of these IRL techniques require solving the computationally intractable forward problem -- computing an optimal policy given a reward function -- in POMDPs. The developed algorithm reduces the information asymmetry while increasing the data efficiency by incorporating task specifications expressed in temporal logic into IRL. Such specifications may be interpreted as side information available to the learner a priori in addition to the demonstrations. Further, the algorithm avoids a common source of algorithmic complexity by building on causal entropy as the measure of the likelihood of the demonstrations as opposed to entropy. Nevertheless, the resulting problem is nonconvex due to the so-called forward problem. We solve the intrinsic nonconvexity of the forward problem in a scalable manner through a sequential linear programming scheme that guarantees to converge to a locally optimal policy. In a series of examples, including experiments in a high-fidelity Unity simulator, we demonstrate that even with a limited amount of data and POMDPs with tens of thousands of states, our algorithm learns reward functions and policies that satisfy the task while inducing similar behavior to the expert by leveraging the provided side information.
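The forward problem referenced above, computing a policy given a reward function, has a classic soft Bellman solution in the maximum-causal-entropy setting. The numpy sketch below shows it for a tiny, fully observable MDP with illustrative random numbers; the paper's actual setting is a POMDP, where this problem becomes intractable and is instead approached via the sequential linear programming scheme:

```python
import numpy as np

def soft_value_iteration(P, r, gamma=0.9, iters=500):
    """Soft (max-causal-entropy) Bellman backup:
        Q[s,a] = r[s,a] + gamma * sum_s' P[s,a,s'] * V[s']
        V[s]   = log sum_a exp(Q[s,a])
    The induced stochastic policy is pi(a|s) = exp(Q[s,a] - V[s])."""
    S, A, _ = P.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = r + gamma * P @ V            # expected soft value of successors
        V = np.log(np.exp(Q).sum(axis=1))  # soft max over actions
    return np.exp(Q - V[:, None])        # policy, shape (S, A)

rng = np.random.default_rng(0)
S, A = 4, 2
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)        # row-stochastic transitions
r = rng.random((S, A))
pi = soft_value_iteration(P, r)
print(np.allclose(pi.sum(axis=1), 1.0))  # True: valid stochastic policy
```

The resulting policy is stochastic by construction, which is what makes causal entropy a well-defined likelihood for the expert demonstrations.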